
    TREATMENT OF INFLUENTIAL OBSERVATIONS IN THE CURRENT EMPLOYMENT STATISTICS SURVEY

    In many establishment surveys, the sample contains a fraction of observations that may seriously affect survey estimates. Influential observations may appear in the sample due to imperfections of the survey design, which cannot fully account for the dynamic and heterogeneous nature of the population of businesses. An observation may become influential due to a relatively large survey weight, an extreme value, or a combination of the weight and value. We propose a Winsorized estimator with a choice of cutoff points that guarantees that the resulting mean squared error is lower than the variance of the original survey-weighted estimator. This estimator is based on very unrestrictive modeling assumptions and can be safely used when the sample is sufficiently large. We consider a different approach when the sample is small. Estimation from small samples generally relies on strict model assumptions. Robustness here is understood as insensitivity of an estimator to model misspecification or to the appearance of outliers. The proposed approach is a slight modification of the classical linear mixed model application to small area estimation. The underlying distribution of the random error term is a scale mixture of two normal distributions. This setup can describe outliers in individual observations. It is also suitable for a more general situation where units from two distinct populations are put together for estimation. The mixture group indicator is not observed. The probabilities of observations coming from a group with a smaller or larger variance are estimated from the data. These conditional probabilities can serve as the basis for a formal test for outlyingness at the area level. Simulations are carried out to compare several alternative estimators under different scenarios. Performance of the bootstrap method for prediction confidence intervals is investigated using simulations. We also compare the proposed method with alternative existing methods in a study using data from the Current Employment Statistics Survey conducted by the U.S. Bureau of Labor Statistics.
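
The conditional probabilities mentioned in the abstract can be illustrated with a minimal sketch. It assumes a zero-mean error term drawn from a scale mixture of two normals and applies Bayes' rule to a single residual. The function names, the mixing proportion `eps`, and the two standard deviations are hypothetical illustration values, not quantities taken from the paper, and the paper's actual estimation procedure (which estimates these parameters from the data within a linear mixed model) is not reproduced here.

```python
import math


def normal_pdf(x, sd):
    """Density of a N(0, sd^2) distribution evaluated at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def prob_large_variance(residual, eps, sd_small, sd_large):
    """Posterior probability that a residual came from the larger-variance
    component of a two-component scale mixture of normals.

    eps is the (assumed known) prior probability of the larger-variance
    component; in practice it would be estimated from the data.
    """
    num = eps * normal_pdf(residual, sd_large)
    den = num + (1.0 - eps) * normal_pdf(residual, sd_small)
    return num / den


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    for r in (0.5, 2.0, 5.0):
        p = prob_large_variance(r, eps=0.1, sd_small=1.0, sd_large=5.0)
        print(f"residual={r:4.1f}  P(large-variance component)={p:.4f}")
```

Under these assumed values, residuals near zero are attributed mostly to the smaller-variance component, while large residuals are attributed almost entirely to the larger-variance one, which is the behavior a test for outlyingness would rely on.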

    Methods for Combining Probability and Nonprobability Samples Under Unknown Overlaps

    Nonprobability (convenience) samples are increasingly sought to increase the effective sample size and thereby stabilize estimates of one or more population variables of interest obtained from a randomized survey (reference) sample. Estimation of a population quantity derived from a convenience sample will typically result in bias, since the distribution of the variables of interest in the convenience sample differs from that in the population. A recent set of approaches estimates conditional (on sampling design predictors) inclusion probabilities for convenience sample units by specifying reference sample-weighted pseudo likelihoods. This paper introduces a novel approach that derives, as our main result, the propensity score for the observed sample as a function of conditional inclusion probabilities for the reference and convenience samples. Our approach allows specification of an exact likelihood for the observed sample. We construct a Bayesian hierarchical formulation that simultaneously estimates sample propensity scores and both conditional and reference sample inclusion probabilities for the convenience sample units. We compare our exact likelihood with the pseudo likelihoods in a Monte Carlo simulation study. Comment: 32 pages, 8 figures